Implement "Improving Text Embeddings with LLMs" #683
CodSpeed Performance Report
Merging #683 will not alter performance.
@alvarobartt, I'm fine with the naming. I would maybe prefer moving the prompts to Jinja templates, as we do in other cases, but it looks good to me anyway!
Yes, see the |
Description
This PR implements all the tasks mentioned in the paper "Improving Text Embeddings with Large Language Models", so that one can reproduce the data generation process for training embedding models with sentence-transformers.

Closes #682
Example
Find a complete example below with all the tasks implemented and how to connect them:
What's missing?
structured_output arg within the InferenceEndpointsLLM as an example